-
-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-104212: Add importlib.util.load_source_path() function #105755
Conversation
imp.load_source() has a 3rd parameter file to pass a file-like object, whereas my proposed function has only 2 arguments. Would it be worth it to implement this parameter? imp.load_source() uses this special loader to implement this parameter: class _HackedGetData:
"""Compatibility support for 'file' arguments of various load_*()
functions."""
def __init__(self, fullname, path, file=None):
super().__init__(fullname, path)
self.file = file
def get_data(self, path):
"""Gross hack to contort loader to deal w/ load_*()'s bad API."""
if self.file and path == self.path:
# The contract of get_data() requires us to return bytes. Reopen the
# file in binary mode if needed.
if not self.file.closed:
file = self.file
if 'b' not in file.mode:
file.close()
if self.file.closed:
self.file = file = open(self.path, 'rb')
with file:
return file.read()
else:
return super().get_data(path)
class _LoadSourceCompatibility(_HackedGetData, machinery.SourceFileLoader):
"""Compatibility support for implementing load_source().""" |
@warsaw @brettcannon: What do you think of this late Python 3.12 addition? I propose to add it after beta1 since I consider that it's a regression, not a new feature: it replaces removed imp.load_source(). I thought that imp.load_source() supported file-like objects which are not on disk, like ZIP archives or anything else. But the imp implementation can reopen the file from its filename, so I'm not really convinced by its usefulness. |
It's a new feature so I wouldn't want it slipped into Python 3.12 (just because imp had it doesn't mean it was a good idea 😉). Plus the code required to do this is already documented at https://docs.python.org/3/library/importlib.html#importing-a-source-file-directly and it's 4 lines. |
Well, there are projects who need this function: see #104212.
This recipe doesn't work if the filename doesn't end with A recipe in the doc is nice, but people (like me) are unable to find it and so come up with their own bad implementation: https://stackoverflow.com/questions/69884878/replacing-imp-with-importlib-maintaining-old-behavior Even if the code is short, the code already evolved once: before, load_module() had to be used, and now this method is gone and exec_module() must be used instead. I'm not going to push too hard to have such function in importlib: I have my own flavor of this function in my project. I already updated it a few time to follow new importlib "trends" and follow deprecations :-) (The first implementation used imp obviously.) |
And they are incorrect. 😉 That's only an issue if you don't specify the loader does it fail. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to decide whether this is meant to only import a module and be as simple as possible, or can you also import a package?
Regardless, it's missing a module spec on the module which is critical.
It also needs to be decided on whether this is necessary.
Lib/importlib/util.py
Outdated
@@ -246,3 +247,13 @@ def exec_module(self, module): | |||
loader_state['__class__'] = module.__class__ | |||
module.__spec__.loader_state = loader_state | |||
module.__class__ = _LazyModule | |||
|
|||
|
|||
def load_source(module_name, filename): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name is misleading because you're not loading from source, but from a file that contains source code. I think you're after something like load_source_path()
.
You're also missing whether this is a package? If that's not possible for simplicity, then that should probably be stated. Otherwise it's a piece of information that a module spec contains.
Lib/importlib/util.py
Outdated
def load_source(module_name, filename): | ||
"""Load a module from a filename.""" | ||
loader = SourceFileLoader(module_name, filename) | ||
module = types.ModuleType(loader.name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should get the module via the spec.
Lib/importlib/util.py
Outdated
"""Load a module from a filename.""" | ||
loader = SourceFileLoader(module_name, filename) | ||
module = types.ModuleType(loader.name) | ||
module.__file__ = filename |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be set by the loader, not manually. And you're missing the spec.
When you're done making the requested changes, leave the comment: |
I don't know how anyone else feels, but we can see if there's any other responses. One key thing, though, is the current implementation in the PR is incomplete as it's missing a spec. It also has to decided whether supporting packages is important or not. |
As an user, I have no idea of what a spec is. I have no idea how to use the importlib API correctly. So I copy/paste random recipes on the Internet and hack them for my needs. That's why I like the idea of a clean API to use the importlib API to "load a Python source and import it". I don't want to load a package. For me, the most common case would be that it's a .py file. I don't see how i would load a package from a filename, since it's a directory. Maybe we need a test to ensure that it doesn't work. |
c3a44b9
to
fb03867
Compare
I updated my PR:
New implementation: def load_source_path(module_name, filename):
"""Load a module from a filename."""
loader = SourceFileLoader(module_name, filename)
spec = spec_from_loader(module_name, loader)
module = module_from_spec(spec)
sys.modules[module.__name__] = module
loader.exec_module(module)
return module |
Another imp.load_source() user who doesn't know how to update the code to Python 3.12 importlib API: https://discuss.python.org/t/how-do-i-migrate-from-imp/27885 |
https://docs.python.org/3/library/importlib.html#importlib.machinery.ModuleSpec 😉 Joking aside, it is documented.
Then you probably should make sure the name doesn't have a
As I said over on discuss.python.org, Miro's ask is actually different from yours. You want to load from a specific path, Miro wants to search and load from a specific directory (note how your use case lacks a search component). |
A package cannot be imported by its directory path, whereas its | ||
``__init__.py`` file (ex: ``package/__init__.py``) can be imported. | ||
|
||
.. versionadded:: 3.12 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR has no needs backport to 3.12
label. Should the version be 3.13 instead?
.. versionadded:: 3.12 | |
.. versionadded:: 3.13 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it seems there is no support for the backport to satisfy a specific use case:
As I said over on discuss.python.org, Miro's ask is actually different from yours. You want to load from a specific path, Miro wants to search and load from a specific directory (note how your use case lacks a search component).
I propose to add this function to Python 3.12 to replace the removed imp.load_source(). If it's too late, it can be added to Python 3.13. But it's unpleasant that Python 3.12 users have no solution (in form of a single function call) in the stdlib: otherwise, people will copy/paste these 3-5 lines to their code, and adding a function to Python 3.13 stdlib becomes less relevant? |
8eec111
to
f134e84
Compare
I updated my PR:
Ok, I added a check to reject names which contain a dot character. |
My implementation is different than Python 3.11 imp.load_source() if the module is already cached in Python 3.11 imp.load_source(): def load_source(name, pathname, file=None):
loader = _LoadSourceCompatibility(name, pathname, file)
spec = util.spec_from_file_location(name, pathname, loader=loader)
if name in sys.modules:
module = _exec(spec, sys.modules[name])
else:
module = _load(spec)
# To allow reloading to potentially work, use a non-hacked loader which
# won't rely on a now-closed file object.
module.__loader__ = machinery.SourceFileLoader(name, pathname)
module.__spec__.loader = module.__loader__
return module See the |
A different design would be to no use If load_source_path() returns the module in Moreover, the function disallows the usage of dots in the module name, which makes the name collision even more likely. |
Examples of projects using imp.load_source():
|
There is no "correct" semantics since this isn't a typical use case. It's trying to work outside of the import system to accomplish something while still using the import system to do some of the work. I think the key question is why would someone want this functionality? If it's to import some module in an odd spot, then it should go into There's a reason why I haven't put something like this into |
The more I dig into the details, the more I understand why it's hard to design a single function implementing "exclusive" use cases, especially the question about caching the module in
Morever, here we are already about replacing existing imp.load_source() calls. But also code written manually (using the imp and importlib modules). For all these reasons, I give up on my attempt to add |
📚 Documentation preview 📚: https://cpython-previews--105755.org.readthedocs.build/